110 research outputs found

    Linear discriminant analysis for the small sample size problem: an overview

    Dimensionality reduction is an important aspect of the pattern classification literature, and linear discriminant analysis (LDA) is one of the most widely studied dimensionality reduction techniques. Variants of LDA that address the small sample size (SSS) problem are applied in many research areas, e.g. face recognition, bioinformatics, and text recognition. Improving the performance of these LDA variants has great potential in various fields of research. In this paper, we present an overview of these methods. We cover the types, characteristics and taxonomy of the methods that can overcome the SSS problem. We also highlight some important datasets and software packages.

    A deterministic approach to regularized linear discriminant analysis

    The regularized linear discriminant analysis (RLDA) technique is one of the popular methods for dimensionality reduction in small sample size problems. In this technique, the regularization parameter is conventionally computed using a cross-validation procedure. In this paper, we propose a deterministic way of computing the regularization parameter in RLDA for the small sample size problem. The computational cost of the proposed deterministic RLDA is significantly lower than that of the cross-validation based RLDA technique. The deterministic RLDA technique is also compared with other popular techniques on a number of datasets, and favorable results are obtained.
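
As a rough illustration of the regularization step described in this abstract, the sketch below adds a multiple of the identity to the within-class scatter matrix so that it becomes invertible even when there are fewer samples than features. The function name `rlda_projection` and the fixed `alpha` are assumptions made for this example; the abstract does not give the paper's deterministic rule for choosing the parameter, so it is not reproduced here.

```python
import numpy as np

def rlda_projection(X, y, alpha=1.0, n_components=None):
    """Generic regularized LDA sketch (alpha is a hypothetical fixed value)."""
    classes = np.unique(y)
    d = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - mean_all)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    # Regularization: Sw is singular in the small sample size case,
    # so alpha * I is added before solving.
    Sw_reg = Sw + alpha * np.eye(d)
    M = np.linalg.solve(Sw_reg, Sb)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)
    k = n_components or (len(classes) - 1)
    return eigvecs[:, order[:k]].real

# Usage: 6 samples in 20 dimensions (fewer samples than features).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (3, 20)), rng.normal(3, 1, (3, 20))])
y = np.array([0, 0, 0, 1, 1, 1])
W = rlda_projection(X, y, alpha=1.0)
Z = X @ W  # projected one-dimensional features
```

In the cross-validation variant mentioned in the abstract, `alpha` would be chosen by evaluating several candidate values on held-out data, which is the cost the deterministic approach aims to avoid.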

    Rotational linear discriminant analysis using Bayes Rule for dimensionality reduction

    Linear discriminant analysis (LDA) finds an orientation that projects high-dimensional feature vectors to a reduced-dimensional feature space in such a way that the overlap between the classes in this feature space is minimized. This overlap is usually finite and produces a finite classification error, which is further reduced by the rotational LDA technique. Rotational LDA rotates the classes individually in the original feature space in a manner that enables further reduction of the error. In this paper, we present an extension of the rotational LDA technique that uses Bayes decision theory for class separation, which improves the classification performance even further.

    Design and implementation of fuzzy based control system for natural gas pipes system based on LabVIEW

    The quality of Natural Gas Piping Systems (NGPS) must be ensured against any manufacturing defects. For this purpose, we developed a special testing machine (STM), constructed in the lab, to test NGPS. The proposed STM tests the weak points at the pipe connections, e.g. pipe bends and intermediate connections. For more than 1500 NGPS pieces, crack propagation was simultaneously tracked and monitored on the output screen at the critical positions of the pipeline connections. The control system uses LabVIEW tools for signal acquisition and monitoring, and for designing the control strategy.

    A filter based feature selection algorithm using null space of covariance matrix for DNA microarray gene expression data

    We propose a new filter based feature selection algorithm for classification based on DNA microarray gene expression data. It uses the null space of the covariance matrix for feature selection. The algorithm can perform bulk reduction of features (genes) while retaining the discriminative information in the reduced subset of features. Thus, it can be used as a pre-processing step for other feature selection algorithms. The algorithm does not assume statistical independence among the features. It shows promising classification accuracy when compared with other existing techniques on several DNA microarray gene expression datasets.

    MoRFPred-plus: Computational Identification of MoRFs in Protein Sequence using physicochemical properties and HMM profiles

    Intrinsically Disordered Proteins (IDPs) lack a stable tertiary structure and actively participate in various biological functions. These IDPs expose short binding regions called Molecular Recognition Features (MoRFs) that permit interaction with structured protein regions. Upon interaction they undergo a disorder-to-order transition, from which their functionality arises. Predicting these MoRFs in disordered protein sequences is a challenging task. In this study, we present MoRFpred-plus, an improved predictor over our previously proposed predictor, to identify MoRFs in disordered protein sequences. Two independent propensity scores are computed by incorporating physicochemical properties and HMM profiles; these scores are combined to predict the final MoRF propensity score for a given residue. The first score reflects the likelihood that a query residue is part of a MoRF region based on the composition and similarity of assumed MoRF and flank regions. The second score reflects the likelihood that a query residue is part of a MoRF region based on the properties of the flanks surrounding the given residue in the query protein sequence. The propensity scores are processed and common averaging is applied to generate the final prediction score of MoRFpred-plus. The performance of the proposed predictor is compared with the available MoRF predictors MoRFchibi, MoRFpred, and ANCHOR. On the previously collected training and test sets used to evaluate those predictors, the proposed predictor outperforms them and yields a lower false positive rate. In addition, MoRFpred-plus is a downloadable predictor, so its output can be used as input to other computational tools.

    Detecting TCP SYN Flood Attack in the Cloud

    In this paper, an approach to protecting virtual machines (VMs) against TCP SYN flood attacks in a cloud environment is proposed. The open source cloud platform Eucalyptus is deployed and experiments are carried out on this setup. We investigate attacks emanating from one VM to another in a multi-tenancy cloud environment. Various attack scenarios are executed against a webserver VM. To detect such attacks from a cloud provider's perspective, a security mechanism involving a packet sniffer, a feature extraction process, a classifier and an alerting component is proposed and implemented. We experiment with k-nearest neighbor and artificial neural network classifiers for detecting the attack. The dataset obtained from the attacks on the webserver VM is passed through the classifiers. The artificial neural network produced an F1 score of 1 on the test cases, meaning that malicious attack traffic was separated from legitimate traffic without any false positives or false negatives. The proposed security mechanism shows promising results in detecting TCP SYN flood attack behaviors in the cloud.
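
To make the F1 claim in this abstract concrete, the sketch below computes the F1 score from binary labels using the standard definition (the harmonic mean of precision and recall). The function name `f1_score` and the toy label lists are assumptions for illustration only; they are not data from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 1 = attack traffic, 0 = legitimate traffic.
perfect = f1_score([1, 1, 0, 0], [1, 1, 0, 0])
one_miss = f1_score([1, 1, 0, 0], [1, 0, 0, 0])
```

An F1 score of 1 is only reached when there are no false positives and no false negatives, which is why it corresponds to every test case being classified correctly.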

    Predicting MoRFs in protein sequences using HMM profiles

    Background: Intrinsically Disordered Proteins (IDPs) lack an ordered three-dimensional structure and are enriched in various biological processes. Molecular Recognition Features (MoRFs) are functional regions within IDPs that undergo a disorder-to-order transition on binding to a partner protein. Identifying MoRFs in IDPs using computational methods is a challenging task. Methods: In this study, we introduce hidden Markov model (HMM) profiles to accurately identify the location of MoRFs in disordered protein sequences. Using a windowing technique, HMM profiles are used to extract features from protein sequences, and support vector machines (SVMs) are used to calculate a propensity score for each residue. Two different SVM kernels with high noise tolerance are evaluated with varying window sizes, and the scores of the SVM models are combined to generate the final propensity score used to predict MoRF residues. The SVM models are designed to extract maximal information between MoRF residues, their neighboring regions (Flanks) and the remainder of the sequence (Others). Results: To evaluate the proposed method, its performance was compared to that of two other MoRF predictors, MoRFpred and ANCHOR. The results show that the proposed method outperforms these two predictors. Conclusions: Using HMM profiles as a source for feature extraction, the proposed method shows improvement in predicting MoRFs in disordered protein sequences.
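
The windowing step mentioned in this abstract can be pictured with a generic sliding-window sketch: for each residue, the profile rows of that residue and its neighbors are concatenated into one feature vector. The function name `window_features`, the zero padding at the sequence ends, and the toy profile are assumptions for this example; the abstract does not specify how the paper handles sequence boundaries or which window sizes it uses.

```python
import numpy as np

def window_features(profile, flank=1):
    """Concatenate each residue's profile row with `flank` rows on each side.

    profile: array of shape (L, F), one row of F profile values per residue.
    Returns an array of shape (L, (2 * flank + 1) * F).
    Sequence ends are zero-padded (an assumption made for this sketch).
    """
    L, F = profile.shape
    pad = np.zeros((flank, F))
    padded = np.vstack([pad, profile, pad])
    width = 2 * flank + 1
    return np.stack([padded[i:i + width].ravel() for i in range(L)])

# Toy profile: 5 residues, 2 values per residue (hypothetical numbers).
profile = np.arange(10).reshape(5, 2)
features = window_features(profile, flank=1)
```

Each row of `features` could then be passed to a per-residue classifier such as an SVM, with one model per window size if several sizes are evaluated.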

    Improving protein fold recognition using the amalgamation of evolutionary-based and structural-based information

    Deciphering the three-dimensional structure of a protein sequence is a challenging task in biological science. Protein fold recognition and protein secondary structure prediction are transitional steps in identifying the three-dimensional structure of a protein. For protein fold recognition, evolutionary-based information of amino acid sequences from the position specific scoring matrix (PSSM) has recently been applied with improved results. On the other hand, the SPINE-X predictor has been developed and applied for protein secondary structure prediction. Several reported methods for protein fold recognition have only limited accuracy. In this paper, we develop a strategy that combines evolutionary-based information (from PSSM) with secondary structure predicted using SPINE-X to improve protein fold recognition. The strategy is based on finding the probabilities of amino acid pairs (AAP). The proposed method has been tested on several protein benchmark datasets, and an improvement of 8.9% in recognition accuracy has been achieved. We have achieved, for the first time, over 90% and 75% prediction accuracies for sequence similarity values below 40% and 25%, respectively. We also obtain 90.6% and 77.0% prediction accuracies, respectively, for the Extended Ding and Dubchak and the Taguchi and Gromiha benchmark protein fold recognition datasets, which are widely used in the literature.

    Application of cepstrum analysis and linear predictive coding for motor imaginary task classification

    In this paper, classification of electroencephalography (EEG) signals of motor imaginary tasks is studied using cepstrum analysis and linear predictive coding (LPC). The Brain-Computer Interface (BCI) competition III dataset IVa, which contains motor imaginary tasks for the right hand and foot of five subjects, is used. The data was preprocessed by whitening and then filtering the signal, followed by feature extraction. A random forest classifier is then trained using the cepstrum and LPC features to classify the motor imaginary tasks. The resulting classification accuracy is found to be over 90%. This research shows that concatenating appropriate features of different types, such as cepstrum and LPC features, holds some promise for the classification of motor imaginary tasks, which can be helpful in the BCI context.
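
For readers unfamiliar with cepstrum analysis, the sketch below computes the real cepstrum of a one-dimensional signal as the inverse Fourier transform of the log magnitude spectrum. The function name `real_cepstrum`, the small `eps` added before the logarithm, and the test signal are assumptions for this example; the abstract does not state which cepstrum variant or how many coefficients the paper uses.

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Real cepstrum: inverse FFT of the log magnitude spectrum.

    eps guards against log(0) and is a choice made for this sketch.
    """
    spectrum = np.fft.fft(x)
    log_magnitude = np.log(np.abs(spectrum) + eps)
    return np.fft.ifft(log_magnitude).real

# A unit impulse has a flat magnitude spectrum, so its cepstrum is ~0.
impulse = np.zeros(64)
impulse[0] = 1.0
cepstrum = real_cepstrum(impulse)
```

In a feature-concatenation setup like the one described, the first few cepstral coefficients of each EEG channel would be placed alongside the LPC coefficients in a single feature vector before training the classifier.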